LUGS: A Scalable Non-parametric Data Synthesizer for Privacy-Preserving Health Data Publiction
ثبت نشده
چکیده
This paper introduces a non-parametric data synthesizing algorithm to generate privacysafe “realistic but not real” synthetic health data. The proposed algorithm synthesizes artificial records while preserving the statistical characteristics of the original data to the extent possible. The risk from “database linking attack” is quantified by an l-diversified data generation process. Moreover its algorithmic performance is optimized using Locality-Sensitive Hashing and parallel computation techniques to yield a linear-time algorithm that is suitable for Big Health data applications. We synthesize a public Medicare claim dataset using the proposed algorithm, and demonstrate multiple data mining applications and statistical analyses using the data. The synthetic dataset delivers results that are substantially identical to those obtained from the original dataset, without revealing the actual records.
منابع مشابه
Privacy and Security of Big Data in THE Cloud
Big data has been arising a growing interest in both scien- tific and industrial fields for its potential value. However, before employing big data technology into massive appli- cations, a basic but also principle topic should be investigated: security and privacy. One of the biggest concerns of big data is privacy. However, the study on big data privacy is still at a very early stage. Many or...
متن کاملPrivacy and Security of Big Data in THE Cloud
Big data has been arising a growing interest in both scien- tific and industrial fields for its potential value. However, before employing big data technology into massive appli- cations, a basic but also principle topic should be investigated: security and privacy. One of the biggest concerns of big data is privacy. However, the study on big data privacy is still at a very early stage. Many or...
متن کاملA centralized privacy-preserving framework for online social networks
There are some critical privacy concerns in the current online social networks (OSNs). Users' information is disclosed to different entities that they were not supposed to access. Furthermore, the notion of friendship is inadequate in OSNs since the degree of social relationships between users dynamically changes over the time. Additionally, users may define similar privacy settings for their f...
متن کاملBig Data Processing with Privacy Preserving Using Map Reduce on Cloud
A large number of cloud services require users to share private data like electronic health records for data analysis or mining, bringing privacy concerns. Anonymizing data sets via generalization to satisfy certain privacy requirements such as k-anonymity is a widely used category of privacy preserving techniques. At present, the scale of data in many cloud applications increases tremendously ...
متن کاملAttribute-based Access Control for Cloud-based Electronic Health Record (EHR) Systems
Electronic health record (EHR) system facilitates integrating patients' medical information and improves service productivity. However, user access to patient data in a privacy-preserving manner is still challenging problem. Many studies concerned with security and privacy in EHR systems. Rezaeibagha and Mu [1] have proposed a hybrid architecture for privacy-preserving accessing patient records...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013